Abstract. It is known that the problem of deleting at most k vertices to obtain a proper
interval graph (Proper Interval Vertex Deletion) is fixed parameter tractable. However,
whether the problem admits a polynomial kernel was open. Here, we answer this question
in the affirmative by obtaining a polynomial kernel for Proper Interval Vertex Deletion. This
resolves an open question of van Bevern, Komusiewicz, Moser, and Niedermeier.

1 Introduction 

The study of graph editing problems covers a large part of parameterized complexity. The problem
of editing (adding/deleting vertices/edges) a graph so that it has some property is a
well studied problem in the theory and applications of graph algorithms. When we want the
edited graph to be in a hereditary (that is, closed under induced subgraphs) graph class,
the optimization versions of the corresponding node/edge deletion problems are known to be
NP-complete by a classical result of Lewis and Yannakakis [17]. In this paper we study the
problem of deleting vertices to obtain a proper interval graph from the viewpoint of kernelization
complexity.

A graph G is a proper (unit) interval graph if it is an intersection graph of unit-length
intervals on a real line. Proper interval graphs form a well studied and well structured hereditary
class of graphs. The parameterized study of the following problem of deleting vertices
to obtain a proper interval graph was initiated by van Bevern et al. [23].

p-Proper Interval Vertex Deletion (PIVD) Parameter: k

Input: An undirected graph G and a positive integer k 

Question: Decide whether G has a vertex set X of size at most k such that G \ X is a 
proper interval graph 

Wegner [25] (see also [3]) showed that proper interval graphs are exactly the class of
graphs that are {claw, net, tent, hole}-free. Claw, net, and tent are graphs containing at most
6 vertices depicted in Fig. 1, and a hole is an induced cycle of length at least four. Combining
results of Wegner, Cai, and Marx [4,19,25], it can be shown that PIVD is FPT. That is,
one can obtain an algorithm for PIVD running in time τ(k) · n^{O(1)}, where τ is a function
depending only on k and n is the number of vertices in the input graph. Van Bevern et
al. [23] presented a faster O(k(14k + 14)^{k+1} n^6) time algorithm for PIVD using the structure
of a problem instance that is already {claw, net, tent, C_4, C_5, C_6}-free. The running time was
recently improved by Villanger down to O(6^k k n^6) [24]. However, the question of whether the
problem has a polynomial kernel was not resolved. This question was explicitly asked
by van Bevern et al. [23]. This is precisely the problem we address in this paper.

Here, we study PIVD from the kernelization perspective. A parameterized problem is said
to admit a polynomial kernel if every instance (I, k) can be reduced in polynomial time to
an equivalent instance with both size and parameter value bounded by a polynomial in k. In
other words, it is possible in polynomial time to "compress" every instance of the problem
to a new instance of size k^{O(1)}.

The study of kernelization is one of the leading research frontiers of modern parameterized
complexity, and the major recent advances in the area are on kernelization. Important
recent developments are the introduction of new lower bound techniques, showing (under
complexity theoretic assumptions) that certain problems must have kernels of at least certain
sizes [2,5,10], and general results showing that large classes of problems have small (e.g.,
linear) kernels [1,9,15]. Identifying the intractability borders inside the class FPT is an important
challenge in parameterized complexity, and kernelization provides a way to do it, as
we can classify problems based on the kernel sizes they admit. From the practical side,
kernelization algorithms often lead to efficient preprocessing rules which can significantly
reduce and simplify the initial instance [14,26]. Thus, kernelization provides a framework for
the mathematical analysis of polynomial time preprocessing.

Our interest in PIVD is also motivated by the following more general problem. Let Q be
an arbitrary class of graphs. We denote by Q + kv the class of graphs that can be obtained
from a member of Q by adding at most k vertices. For example, PIVD is equivalent to
deciding if G is in Q + kv, where Q is the class of proper interval graphs. There is a generic
criterion providing sufficient conditions on the properties of the class Q for the Q + kv
recognition problem to admit a polynomial kernel. A graph class is called hereditary if every induced
subgraph of every graph in the class also belongs to the class. Let Π be a hereditary graph
class characterized by forbidden induced subgraphs of size at most d. Cai [4] showed that the
Π + kv problem, where given an input graph G and a positive integer k, the question is to
decide whether there exists a vertex subset S of size at most k such that G[V \ S] ∈ Π, is FPT when
parameterized by k. The Π + kv problem can be shown to be equivalent to p-d-Hitting
Set and thus it admits a polynomial kernel [16]. In the p-d-Hitting Set problem, we are
given a family F of sets of size at most d over a universe U and a positive integer k, and the
objective is to find a subset S ⊆ U of size at most k intersecting, or hitting, every set of F.
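
To make the p-d-Hitting Set definition concrete, the following is a minimal Python sketch. The function names is_hitting_set and find_hitting_set, as well as the toy family, are illustrative assumptions and do not come from the paper; the search is a plain brute force (exponential in k) and is not a kernelization or FPT algorithm.

```python
from itertools import combinations

def is_hitting_set(family, candidate):
    """Return True if `candidate` intersects every set in `family`."""
    chosen = set(candidate)
    return all(chosen & set(s) for s in family)

def find_hitting_set(family, universe, k):
    """Brute-force search for a hitting set of size at most k.

    Illustration only: tries all subsets of the universe of size 0..k.
    Returns a hitting set, or None if none of size at most k exists.
    """
    for size in range(k + 1):
        for cand in combinations(sorted(universe), size):
            if is_hitting_set(family, cand):
                return set(cand)
    return None

# Hypothetical toy instance with d = 3.
family = [{1, 2, 3}, {3, 4}, {4, 5, 6}]
universe = {1, 2, 3, 4, 5, 6}

no_solution = find_hitting_set(family, universe, 1)
solution = find_hitting_set(family, universe, 2)
```

In this toy instance no single element hits all three sets, while a two-element hitting set exists.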

However, the result of Cai does not settle the parameterized complexity of Π + kv when
Π cannot be characterized by a finite number of forbidden induced subgraphs. Here even
for graph classes with well-understood structure and a very simple infinite set of forbidden
subgraphs, the situation becomes challenging. In particular, for the "closest relatives" of
proper interval graphs, chordal and interval graphs, the current situation is still obscure. For
example, the FPT algorithm of Marx [19] for the problem of vertex deletion into a chordal
graph, i.e. a graph without induced cycles of length at least four, requires heavy algorithmic
machinery. The question whether Chordal + kv admits a polynomial kernel is still open. The situation
with Interval + kv is even more frustrating: in this case we do not even know if the problem
is in FPT or not.

In this paper we make a step towards understanding the kernelization behaviour of Q + kv
recognition problems, where Q is well understood and the infinite set of forbidden subgraphs
is simple. A generic strategy to obtain an FPT algorithm for many Q + kv recognition problems
is to first take care of small forbidden subgraphs by branching on them. When these small
subgraphs are not present, the structure of the graph is utilised to take care of the infinite family of
forbidden subgraphs. However, to apply a similar strategy for a kernelization algorithm we need
to obtain a polynomial kernel for a variant of p-d-Hitting Set that preserves all minimal
solutions of size at most k along with a witness for the minimality, rather than the kernel for
p-d-Hitting Set which was sufficient when we only had finitely many forbidden induced
subgraphs. Preserving the witness for the minimality is crucial here, as this "insulates" the small
constant-size forbidden induced subgraphs from the large and infinite forbidden induced
subgraphs. In some way it mimics the generic strategy used for the FPT algorithm. Towards
this we show that one can indeed obtain a kernel for a variant of p-d-Hitting Set that
preserves all minimal solutions of size at most k along with a witness for the minimality
(Section 3). Finally, using this in combination with reduction rules that shrink "clique and
clique paths" in proper interval graphs, we resolve the kernelization complexity of PIVD. We
show that PIVD admits a polynomial kernel, thus resolving the open question posed
in [23]. We believe that our strategy to obtain a polynomial kernel for PIVD will be useful in
obtaining polynomial kernels for various other Q + kv recognition problems.

2 Definitions and notations 

We consider simple, finite, and undirected graphs. For a graph G, V(G) is the vertex set of
G and E(G) is the edge set of G. For every edge uv ∈ E(G), vertices u and v are adjacent or
neighbours. The neighbourhood of a vertex u in G is N_G(u) = {v | uv ∈ E}, and the closed
neighbourhood of u is N_G[u] = N_G(u) ∪ {u}. When the context is clear we omit the
subscript. A set X ⊆ V is called a clique of G if the vertices in X are pairwise adjacent. A
maximal clique is a clique that is not a proper subset of any other clique. For U ⊆ V, the
subgraph of G induced by U is denoted by G[U]; it is the graph with vertex set U and
edge set equal to the set of edges uv ∈ E with u, v ∈ U. For every U ⊆ V, G' = G[U] is an
induced subgraph of G. By G \ X for X ⊆ V, we denote the graph G[V \ X].

Parameterized problems and kernels. A parameterized problem Π is a subset of Γ* × N
for some finite alphabet Γ. An instance of a parameterized problem consists of (x, k), where
k is called the parameter. The notion of kernelization is formally defined as follows. A
kernelization algorithm, or in short, a kernelization, for a parameterized problem Π ⊆ Γ* × N
is an algorithm that, given (x, k) ∈ Γ* × N, outputs in time polynomial in |x| + k a pair
(x', k') ∈ Γ* × N such that (a) (x, k) ∈ Π if and only if (x', k') ∈ Π and (b) |x'|, k' ≤ g(k),
where g is some computable function depending only on k. The output of kernelization
(x', k') is referred to as the kernel and the function g is referred to as the size of the kernel.
If g(k) ∈ k^{O(1)}, then we say that Π admits a polynomial kernel. For general background on
the theory, the reader is referred to the monographs [6,8,20].

Interval graphs. A graph G is an interval graph if and only if we can associate with each
vertex v ∈ V(G) an open interval I_v = (l_v, r_v) on the real line, such that for all v, w ∈ V(G),
v ≠ w, vw ∈ E(G) if and only if I_v ∩ I_w ≠ ∅. The set of intervals ℐ = {I_v}_{v ∈ V} is called
an (interval) representation of G. By the classical result of Gilmore and Hoffman [12], and
Fulkerson and Gross [11], for every interval graph G there is a linear ordering of its maximal
cliques such that for every vertex v, the maximal cliques containing v occur consecutively.
We refer to such an ordering of maximal cliques C_1, C_2, …, C_p of an interval graph G as a clique
path of G. Note that an interval graph can have several different clique paths. A clique path
of an interval graph can be constructed in linear time [11].

A proper interval graph is an interval graph with an interval model where no interval is
properly contained in any other interval. There are several equivalent definitions of proper
interval graphs. A graph G is a unit interval graph if G is an interval graph with an interval
model of unit-length intervals. By the result of Roberts [22], G is a unit interval graph if and
only if it is a proper interval graph. A claw is a graph that is isomorphic to K_{1,3}, see Fig. 1. A
graph is claw-free if it does not have a claw as an induced subgraph. Proper interval graphs
are exactly the claw-free interval graphs [22].

A vertex ordering σ = (v_1, …, v_n) of a graph G = (V, E) is called an interval ordering if for
every 1 ≤ i < j < k ≤ n, v_i v_k ∈ E implies v_j v_k ∈ E. A graph is an interval graph if and only
if it admits an interval ordering [21]. A vertex ordering σ for G is called a proper interval
ordering if for every 1 ≤ i < j < k ≤ n, v_i v_k ∈ E implies v_i v_j, v_j v_k ∈ E. A graph
is a proper interval graph if and only if it admits a proper interval ordering [18]. Interval
orderings and proper interval orderings can be computed in linear time, if they exist. We will
need the following properties of proper interval graphs.

Proposition 1 ([25,3]). A graph G is a proper interval graph if and only if it contains neither
claw, net, tent, nor cycles (holes) of length at least 4 as induced subgraphs.

A circular-arc graph is the intersection graph of a set of arcs on the circle. A circular-arc 
graph is a proper circular-arc graph if no arc is properly contained in any other arc. 

Proposition 2 ([24]). Every connected graph G that contains neither tent, net,
claw, nor induced cycles (holes) of length 4, 5 and 6 as induced subgraphs is a proper circular-arc
graph. Moreover, there is a polynomial time algorithm computing a set X of minimum
size such that G \ X is a proper interval graph.

The following proposition on proper interval orderings of proper interval graphs follows
almost directly from the definition.

Proposition 3. Let σ = (v_1, …, v_n) be a proper interval ordering of G = (V, E).

1. For every maximal clique K of G, there exist integers 1 ≤ i ≤ j ≤ n such that K =
{v_i, v_{i+1}, …, v_{j−1}, v_j}. That is, the vertices of K occur consecutively.

2. For a vertex v_ℓ let i, j be the smallest and the largest numbers such that v_i v_ℓ, v_ℓ v_j ∈ E;
then N[v_ℓ] = {v_i, …, v_j} and the sets {v_i, …, v_ℓ} and {v_ℓ, …, v_j} are cliques.

3. Let C_1, C_2, …, C_p be a clique path of G. If v_i ∈ C_j then v_i ∉ C_{j+ℓ+1}, where ℓ ≥ |N[v_i]|.
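
The definition of a proper interval ordering and item 2 of Proposition 3 can be checked mechanically on small graphs. The sketch below is only an illustration under stated assumptions: the function names and the two toy graphs (a path on four vertices and a claw) are not from the paper, and the check is a cubic-time brute force rather than the linear-time recognition algorithm cited above.

```python
from itertools import combinations, permutations

def is_proper_interval_ordering(order, edges):
    """Check the defining condition: for positions i < j < k,
    an edge v_i v_k forces the edges v_i v_j and v_j v_k."""
    edge_set = {frozenset(e) for e in edges}

    def adjacent(a, b):
        return frozenset((a, b)) in edge_set

    for i, j, k in combinations(range(len(order)), 3):
        vi, vj, vk = order[i], order[j], order[k]
        if adjacent(vi, vk) and not (adjacent(vi, vj) and adjacent(vj, vk)):
            return False
    return True

def closed_neighbourhood_is_consecutive(order, edges, v):
    """Item 2 of Proposition 3: N[v] occupies consecutive positions."""
    position = {u: idx for idx, u in enumerate(order)}
    closed = {v} | {u for e in edges if v in e for u in e}
    indices = sorted(position[u] for u in closed)
    return indices == list(range(indices[0], indices[-1] + 1))

# Hypothetical toy graphs: a path a-b-c-d and a claw with centre c.
path_edges = [("a", "b"), ("b", "c"), ("c", "d")]
claw_edges = [("c", "x"), ("c", "y"), ("c", "z")]

path_ok = is_proper_interval_ordering(["a", "b", "c", "d"], path_edges)
path_bad = is_proper_interval_ordering(["a", "c", "b", "d"], path_edges)
claw_has_ordering = any(
    is_proper_interval_ordering(list(p), claw_edges)
    for p in permutations(["c", "x", "y", "z"])
)
```

The natural ordering of the path satisfies the condition, a shuffled ordering does not, and no ordering of the claw does, in line with Proposition 1.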

3 Sunflower Lemma and minimal hitting sets 

In this section we obtain a kernel for a variant of p-d-Hitting Set that preserves all minimal
solutions of size at most k along with a witness for the minimality. Towards this we introduce
the notion of a sunflower. A sunflower S with k petals and a core Y is a collection of sets
{S_1, S_2, …, S_k} such that S_i ∩ S_j = Y for all i ≠ j; the sets S_i \ Y are petals and we require
that none of them be empty. Note that a family of pairwise disjoint sets is a sunflower (with
an empty core). We need the following algorithmic version of the classical result of Erdős
and Rado [7].

Lemma 1 ([8]). [Sunflower Lemma] Let F be a family of sets over a universe U, each of
cardinality at most d. If |F| > d!(k − 1)^d then F contains a sunflower with k petals, and such
a sunflower can be found in O(k + |F|) time.
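
As an illustration of the sunflower definition, here is a small Python sketch. The names sunflower_core and find_sunflower are assumptions made for this example, and the search is an exponential brute force over subfamilies; it is not the efficient procedure whose existence Lemma 1 asserts.

```python
from itertools import combinations

def sunflower_core(sets):
    """Return the core (a frozenset, possibly empty) if `sets` is a
    sunflower with at least two petals, and None otherwise."""
    sets = [frozenset(s) for s in sets]
    if len(sets) < 2:
        return None
    core = sets[0] & sets[1]
    # Every pairwise intersection must equal the core.
    for a, b in combinations(sets, 2):
        if a & b != core:
            return None
    # Every petal S_i \ Y must be non-empty.
    if any(not (s - core) for s in sets):
        return None
    return core

def find_sunflower(family, petals):
    """Brute-force search for a sunflower with `petals` petals."""
    fam = [frozenset(s) for s in family]
    for group in combinations(fam, petals):
        if sunflower_core(group) is not None:
            return list(group)
    return None

# Hypothetical toy family.
family = [{1, 2}, {1, 3}, {1, 4}, {5, 6}]
flower = find_sunflower(family, 3)
core = sunflower_core(flower)
disjoint_core = sunflower_core([{1}, {2}, {3}])
triangle_core = sunflower_core([{1, 2}, {2, 3}, {1, 3}])
```

Note that the empty core is a valid answer, so callers should compare the result against None rather than rely on truthiness.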

A subset X of U intersecting every set in F is referred to as a hitting set for F. The Sunflower
Lemma is a common tool used in parameterized complexity to obtain a polynomial kernel for
p-d-Hitting Set [8]. The observation is that if F contains a sunflower S = {S_1, …, S_{k+1}}
of cardinality k + 1 then every hitting set of F of size at most k must have a nonempty
intersection with the core Y. However, for our purposes it is crucial that the kernelization algorithm
preserves all small minimal hitting sets. The following application of the Sunflower Lemma
is very similar to its use for kernelization for p-d-Hitting Set. However, it does not seem
to exist in the literature in the form required for our kernelization and thus we give its proof
here.

Lemma 2. Let F be a family of sets of cardinality at most d over a universe U and let k be a
positive integer. Then there is an O(|F|(k + |F|)) time algorithm that finds a non-empty set
F' ⊆ F such that

1. For every Z ⊆ U of size at most k, Z is a minimal hitting set of F if and only if Z is a
minimal hitting set of F'; and

2. |F'| ≤ d!(k + 1)^d.

Proof. The algorithm iteratively constructs sets F_t, where 0 ≤ t ≤ |F|. We start with t = 0
and F_0 = F. For t ≥ 1, we use Lemma 1 to check if there is a sunflower of cardinality k + 2
in F_{t−1}. If there is no such sunflower, we stop and output F' = F_{t−1}. Otherwise, we use
Lemma 1 to construct a sunflower {S_1, S_2, …, S_{k+2}} in F_{t−1}. We put F_t = F_{t−1} \ {S_{k+2}}. At
every step, we delete one subset of F. Thus the algorithm calls the algorithm from Lemma 1
at most |F| times and hence its running time is O(|F|(k + |F|)). Since F' has no sunflower
of cardinality k + 2, by Lemma 1, |F'| ≤ d!(k + 1)^d.

Now we prove that for each t ≥ 1 and for every set Z ⊆ U, it holds that Z is a minimal
hitting set for F_{t−1} of size at most k if and only if Z is a minimal hitting set for F_t of size
at most k. Since for t = 1, F_{t−1} = F, and for some t ≤ |F|, F_t = F', by transitivity this is
sufficient for proving the first statement of the lemma.

The set F_t is obtained from F_{t−1} by removing the set S_{k+2} of the sunflower {S_1, S_2, …, S_{k+2}}
in F_{t−1}. Let Y be the core of this sunflower. If Y = ∅, then F_{t−1} has no hitting set of size
at most k. In this case, F_t contains the pairwise disjoint sets S_1, S_2, …, S_{k+1} and hence F_t also
has no hitting set of size at most k. Thus the interesting case is when Y ≠ ∅.

Let Z be a minimal hitting set for F_{t−1} of size at most k. Since F_t ⊆ F_{t−1}, the set Z
is a hitting set for F_t. We claim that Z is a minimal hitting set for F_t. Targeting towards a
contradiction, let us assume that Z is not a minimal hitting set for F_t. Then there is u ∈ Z
such that Z' = Z \ {u} is a hitting set for F_t. The sets S_1, S_2, …, S_{k+1} form a sunflower in F_t, and
thus every hitting set of size at most k, including Z', intersects its core Y. Thus Z' hits all
sets of F_{t−1}, as it hits all the sets of F_t and it also hits S_{k+2} because Y ⊆ S_{k+2}. Therefore, Z
is not a minimal hitting set in F_{t−1}, which is a contradiction. This shows that Z is a minimal
hitting set for F_t.

Let Z be a minimal hitting set for F_t of size at most k. Every hitting set of size at most k for F_t
should contain at least one vertex of the core Y. Hence Z ∩ Y ≠ ∅. But then Z ∩ S_{k+2} ≠ ∅ and thus
Z is a hitting set for F_{t−1}. Because F_t ⊆ F_{t−1}, Z is a minimal hitting set for F_{t−1}.
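
The pruning procedure from the proof of Lemma 2 can be sketched in Python as follows. This is a hedged illustration only: find_sunflower here is a brute-force stand-in for the algorithm of Lemma 1 (so the running time bound of the lemma does not apply), and the names prune_family and minimal_hitting_sets, like the toy family, are assumptions introduced for the example.

```python
from itertools import combinations

def find_sunflower(family, petals):
    """Brute-force stand-in for Lemma 1: return a tuple of `petals`
    sets forming a sunflower, or None if there is none."""
    for group in combinations(family, petals):
        core = group[0] & group[1]
        same_core = all(a & b == core for a, b in combinations(group, 2))
        if same_core and all(s - core for s in group):
            return group
    return None

def prune_family(family, k):
    """Repeatedly remove one set of a sunflower with k + 2 petals."""
    current = sorted({frozenset(s) for s in family}, key=lambda s: sorted(s))
    while True:
        flower = find_sunflower(current, k + 2)
        if flower is None:
            return current
        current.remove(flower[-1])  # plays the role of S_{k+2}

def minimal_hitting_sets(family, universe, k):
    """All minimal hitting sets of size at most k (brute force)."""
    result = set()
    for size in range(k + 1):
        for cand in combinations(sorted(universe), size):
            z = frozenset(cand)
            hits_all = all(z & s for s in family)
            # Minimal: dropping any element leaves some set unhit.
            minimal = all(
                any(not ((z - {u}) & s) for s in family) for u in z
            )
            if hits_all and minimal:
                result.add(z)
    return result

# Hypothetical toy instance: five sets sharing the element 0, k = 1.
family = [frozenset({0, i}) for i in range(1, 6)]
universe = range(6)
pruned = prune_family(family, 1)
before = minimal_hitting_sets(family, universe, 1)
after = minimal_hitting_sets(pruned, universe, 1)
```

In this toy instance two sets survive, and {0} remains the only minimal hitting set of size at most 1. Pruning down to a single set would wrongly introduce new minimal hitting sets, which is one way to see why the proof works with sunflowers of cardinality k + 2 rather than k + 1.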

Given a family F of sets over a universe U and a subset T ⊆ U, we define F_T as the
subset of F containing all sets Q ∈ F such that Q ⊆ T.

4 Proper Interval Vertex Deletion 

In this section, we apply results from the previous section to obtain a polynomial kernel
for PIVD. Let (G, k) be an instance of PIVD, where G is a graph on n vertices and k is
a positive integer. The kernelization algorithm is given in four steps. First we take care of small
forbidden sets using Lemma 2; the second and the third steps reduce the size of maximal
cliques and shrink the length of induced paths. Finally, we combine the three previous steps into
a kernelization algorithm.

4.1 Small induced forbidden sets 

In this section we show how to use Lemma 2 to identify a vertex subset of V(G)
which allows us to forget about small induced subgraphs in G and to concentrate on long
induced cycles in the kernelization algorithm for PIVD. We view the vertex subsets of G inducing
forbidden subgraphs as a set family. We prove the following lemma.

Lemma 3. Let (G, k) be an instance of PIVD. Then there is a polynomial time algorithm
that either finds a non-empty set T ⊆ V(G) such that

1. G \ T is a proper interval graph;

2. Every set Y ⊆ V(G) of size at most k is a minimal hitting set for nets, tents, claws and
induced cycles C_ℓ, 4 ≤ ℓ ≤ 8, in G if and only if it is a minimal hitting set for nets, tents,
claws and induced cycles C_ℓ, 4 ≤ ℓ ≤ 8, contained in G[T]; and

3. |T| ≤ 8 · 8!(k + 1)^8 + k,

or concludes that (G, k) is a NO instance.

Proof. Let F be the family consisting of all nets, tents, claws and induced cycles C_ℓ, 4 ≤ ℓ ≤ 8,
of the input graph G. We apply Lemma 2 to F and in polynomial time find F' such that

1. Y is a minimal hitting set of F of size at most k if and only if Y is a minimal hitting set
of F' of size at most k; and

2. |F'| ≤ 8!(k + 1)^8.

We take T to be the set of elements contained in any set of F'. Thus |T| ≤ 8 · 8!(k + 1)^8. In graph
theoretic terms, this means that every vertex set Y ⊆ V(G) of size at most k is a minimal
hitting set for nets, tents, claws and induced cycles C_ℓ, 4 ≤ ℓ ≤ 8, contained in G if and only
if it is a minimal hitting set for nets, tents, claws and induced cycles C_ℓ, 4 ≤ ℓ ≤ 8, contained
in G[T]. Then G \ T contains neither tent, net, claw nor induced cycles of length 4, 5 or 6 and,
by Proposition 2, is a proper circular-arc graph. Using Proposition 2, in polynomial time we
find a minimum size set X ⊆ V(G) \ T such that G \ (T ∪ X) is a proper interval graph. If
|X| > k, then we conclude that (G, k) is a NO instance. So we assume that |X| ≤ k.
Now we add X to T, increasing its size by at most k. This concludes the proof. □

In the following subsections we assume that

G_T = G \ T is a proper interval graph and |T| ≤ δ(k) = 8 · 8!(k + 1)^8 + k.



4.2 Finding irrelevant vertices in G_T

In this subsection we show that if the maximum size of a clique in G_T is larger than
(k + 1)(δ(k) + 2), then we can find some irrelevant vertex v ∈ V(G_T) and delete it without altering
the answer to the problem. More precisely, we prove the following result.

Lemma 4. Let G and T be as described before. Furthermore, let the size of a maximum
clique in G_T be greater than ε(k) = (k + 1)(δ(k) + 2). Then in polynomial time we can find
a vertex v ∈ V(G_T) such that (G, k) is a YES instance of PIVD if and only if (G \ v, k) is
a YES instance.

Proof. We start by giving a procedure to find an irrelevant vertex v. Let K be a maximum
clique of G_T; it is well known that a maximum clique can be found in linear time in proper
interval graphs [13]. Let σ = (u_1, …, u_n) be a proper interval ordering of G_T. By Proposition 3,
the vertices of K form an interval in σ; we denote this interval by σ(K). Suppose that |K| > ε(k).
The following procedure marks vertices in the clique K and helps to identify an irrelevant
vertex.

Set Z = ∅. For every vertex v ∈ T, pick k + 1 arbitrary neighbours of v in K, say S_v,
and add them to Z. If v has at most k neighbours in K, then add all of them to Z.
Furthermore, add V^F, the first k + 1 vertices, and V^L, the last k + 1 vertices in σ(K),
to Z. Return Z.
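
The marking procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name mark_clique_vertices, the adjacency-dictionary representation and the toy instance are not from the paper, and the "arbitrary" k + 1 neighbours are simply taken to be the first ones in the order σ(K).

```python
def mark_clique_vertices(clique_in_order, adjacency, modulator, k):
    """Return the marked set Z for a clique K.

    clique_in_order: vertices of K listed in the order sigma(K).
    adjacency: dict mapping each vertex of T to its set of neighbours.
    modulator: the set T.
    """
    marked = set()
    for t in modulator:
        neighbours = [u for u in clique_in_order if u in adjacency.get(t, set())]
        # Slicing keeps all neighbours when there are at most k of them.
        marked.update(neighbours[:k + 1])
    marked.update(clique_in_order[:k + 1])     # V^F: first k + 1 vertices
    marked.update(clique_in_order[-(k + 1):])  # V^L: last k + 1 vertices
    return marked

# Hypothetical toy instance: K has 10 vertices, T = {"a", "b"}, k = 1.
clique = list(range(10))
adjacency = {"a": {0, 1, 2, 3, 4}, "b": {7}}
k = 1
Z = mark_clique_vertices(clique, adjacency, {"a", "b"}, k)
irrelevant_candidates = [v for v in clique if v not in Z]
```

In the toy instance the marked set has five vertices, below the bound (k + 1)(|T| + 2) = 8, and every unmarked vertex of K is a candidate for deletion in the sense of Lemma 4.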

Observe that the above procedure runs in polynomial time and adds at most k + 1 vertices
for any vertex in T. In addition, the procedure also adds at most 2(k + 1) other vertices to Z.
Thus the size of the set Z containing marked vertices is at most (k + 1)(δ(k) + 2) = ε(k).
By our assumption on the size of the clique we have that K \ Z ≠ ∅. We show that any
vertex in K \ Z is irrelevant. Let v ∈ K \ Z. Now we show that (G, k) is a YES instance of
PIVD if and only if (G \ v, k) is a YES instance of PIVD. Towards this goal we first prove
the following auxiliary claim.

Claim. Let H be a proper interval graph, and P = p_1, …, p_ℓ be an induced path in H. Let
u ∉ {p_1, …, p_ℓ} be some vertex of H and let N_P(u) be the set of its neighbours in P. Then
the vertices of N_P(u) occur consecutively on the path P, and furthermore, |N_P(u)| ≤ 4.

Proof. The first statement follows from the fact that H has no induced cycle of length more
than three and the second statement from the fact that H contains no claw. □

Let (G, k) be a YES instance and let X ⊆ V(G) be a vertex set such that |X| ≤ k and
G \ X is a proper interval graph. Then clearly (G \ v, k) is a YES instance of PIVD as
|X \ {v}| ≤ k and G \ ({v} ∪ X) is a proper interval graph.

For the opposite direction, let (G \ v, k) be a YES instance of PIVD and let X be a vertex
set such that |X| ≤ k and G \ ({v} ∪ X) is a proper interval graph. Towards a contradiction,
let us assume that G \ X is not a proper interval graph. Thus G \ X contains one of the
forbidden induced subgraphs for proper interval graphs. We first show that G \ X cannot
contain forbidden induced subgraphs of size at most 8. Let Y be a subset of X that
is a minimal hitting set for nets, tents, claws and induced cycles C_ℓ, 4 ≤ ℓ ≤ 8, contained
in G[T]. By the definition of T and the fact that v ∉ T we know that Y is also a minimal
hitting set for nets, tents, claws and induced cycles C_ℓ, 4 ≤ ℓ ≤ 8, contained in G. Thus, the
only possible candidate for the forbidden subgraph in G \ X is an induced cycle C_ℓ, where
ℓ ≥ 9. Now since G \ (X ∪ {v}) is a proper interval graph, the vertex v is part of the cycle
C_ℓ = {v, w_1, w_2, …, w_ℓ}. Furthermore, w_1 and w_ℓ are the neighbours of v on C_ℓ.

Next we show that using C_ℓ we can construct a forbidden induced cycle in G \ ({v} ∪ X),
contradicting that G \ ({v} ∪ X) is a proper interval graph. Towards this we proceed as
follows. For vertex sets V^F and V^L (the first and the last k + 1 vertices of σ(K)), we pick
a vertex v^F ∈ V^F \ X and a vertex v^L ∈ V^L \ X. Because |X| ≤ k, such vertices always exist.

Claim. Vertices w_1, w_ℓ ∈ T ∪ N[v^F] ∪ N[v^L].

Proof. If w_a ∈ T ∪ K, a ∈ {1, ℓ}, then because K ⊆ N[v^F], we are done. Otherwise w_a
is a vertex of the proper interval graph G_T \ K. Then w_a occurs either before or after the
vertices of K in σ. If w_a occurs before, then w_a < v^F < v on σ. Now since w_a has an edge
to v and σ is a proper interval ordering of G_T, we have that w_a v^F is an edge and hence
w_a ∈ N[v^F]. The case where w_a occurs after K is symmetric; in this case
w_a ∈ N[v^L]. □

Now with w_a, a ∈ {1, ℓ}, we associate a partner vertex p(w_a). If w_a ∈ T, then Z ∩ N(w_a)
contains at least k + 1 vertices, as v ∈ K ∩ N(w_a) is not in Z. Thus there exists z_a ∈
(Z ∩ N(w_a)) \ X. In this case we define p(w_a) to be z_a. If w_a ∉ T then by the preceding claim we know
that either v^F or v^L is a neighbour of w_a. If v^F is a neighbour of w_a then we define p(w_a) = v^F,
else p(w_a) = v^L. Observe that p(w_a) ∈ K \ {v} for a ∈ {1, ℓ}.

Now consider the closed walk W = {p(w_1), w_1, …, w_ℓ, p(w_ℓ)} in G \ ({v} ∪ X). First of
all, W is a closed walk because p(w_1) and p(w_ℓ) are adjacent. In fact, we would like to show
that W is a simple cycle in G \ ({v} ∪ X) (not necessarily an induced cycle). Towards this
we first show that p(w_a) ∉ {w_1, …, w_ℓ}. Suppose p(w_1) ∈ {w_1, …, w_ℓ}; then it must be
w_2, as the only neighbours of w_1 on C_ℓ are v and w_2. However, v and p(w_1) are part of the
same clique K. This implies that v has w_1, w_2 and w_ℓ as its neighbours on C_ℓ, contradicting
the fact that C_ℓ is an induced cycle of length at least 9 in G. Similarly, we can also
show that p(w_ℓ) ∉ {w_1, …, w_ℓ}. Now, the only reason W may not be a simple cycle is
that p(w_1) = p(w_ℓ). However, in that case W = {p(w_1), w_1, …, w_ℓ} is a simple cycle in
G \ ({v} ∪ X).

Notice that G[{w_1, w_2, …, w_ℓ}] is an induced path, where ℓ ≥ 8. Let i be the largest
integer such that w_i ∈ N(p(w_1)) and let j be the smallest integer such that w_j ∈ N(p(w_ℓ)).
By the claim on induced paths above and the conditions that G[{w_1, w_2, …, w_ℓ}] is an induced path,
w_1 ∈ N(p(w_1)), and w_ℓ ∈ N(p(w_ℓ)), we get that i ≤ 4 and j ≥ ℓ − 3. As ℓ ≥ 8 this implies that i < j, and hence
G[{p(w_1), w_i, …, w_j, p(w_ℓ)}] is an induced cycle of length at least four in G \ ({v} ∪ X), which is a
contradiction. Therefore, G \ X is a proper interval graph. □

4.3 Shrinking G_T

Let (G, k) be a YES instance of PIVD, and let T be a vertex subset of G of size at most
δ(k) such that G_T = G \ T is a proper interval graph with maximum clique size at most
ε(k) = (k + 1)(δ(k) + 2). The following lemma argues that if G_T has a sufficiently long clique
path, then a part of this path can be shrunk without changing the solution.

Lemma 5. Let us assume that every claw in G contains at least two vertices from T and that
there is a connected component of G_T with at least ζ(k) = (δ(k)(8ε(k) + 2) + 1)(2[ε(k)]^2 +
32ε(k) + 3) maximal cliques. Then there is a polynomial time algorithm transforming G into
a graph G' such that

- (G, k) is a YES instance if and only if (G', k) is a YES instance;
- |V(G')| < |V(G)|.

Proof. Let I be a connected component of G_T with at least ζ(k) maximal cliques and let C_1, C_2, …, C_p,
p ≥ ζ(k), be a clique path of I. We first show that every vertex v of I belongs to at most
2ε(k) maximal cliques of I. We know that every clique of I has size at most ε(k) and by
property 2 of Proposition 3 we have that the neighbourhood of any vertex v can be covered
with at most 2 cliques. Thus, |N_I(v)| ≤ 2ε(k). Now, by the last property of Proposition 3,
we have that if j is the least integer such that v ∈ C_j, then v ∉ C_{j+ℓ+1} for ℓ ≥ |N_I(v)|. This
proves our claim. For every vertex v ∈ T, we mark all maximal cliques of I containing at
least one neighbour of v. Let m(v) be the set of maximal cliques marked for vertex v. We
claim that

∀ v ∈ T, |m(v)| ≤ 8ε(k) + 2. (1)

Every vertex of I is in at most 2ε(k) maximal cliques, and thus in every set of 4ε(k) + 1
maximal cliques, no vertex of the first clique (in the ordering of the clique path) can be
adjacent to a vertex of the last clique. Thus in every set of 8ε(k) + 2 maximal cliques, it
is always possible to select three cliques such that no vertex of one clique is adjacent to a
vertex of another. Thus, if m(v) consisted of at least 8ε(k) + 2 maximal cliques, vertex v
would have at least three neighbours in I which are pairwise non-adjacent. In other words,
these three vertices together with v form a claw. This claw contains exactly one vertex from
T, contradicting the assumption that every claw in G has at least two vertices from T. This
proves (1).

By (1), the total number of marked maximal cliques in I is at most |T|(8ε(k) + 2) =
δ(k)(8ε(k) + 2). By the pigeonhole principle, the clique path C_1, C_2, …, C_p contains at least

ℓ = ζ(k) / (δ(k)(8ε(k) + 2) + 1) = 2[ε(k)]^2 + 32ε(k) + 3

consecutive unmarked maximal cliques, i.e. cliques containing no vertices adjacent to vertices
of T. Let C_i, C_{i+1}, …, C_{i+ℓ−1} be a set of unmarked consecutive cliques. Let q = 16ε(k) + 1.
Then for every v ∈ C_i and u ∈ C_{i+q}, the distance between v and u in G is at least 9. This is
because every shortest path from v to u should contain at least one vertex either from each
of the cliques C_i, C_{i+1}, …, C_{i+q} or from each of the cliques C_{i+q}, C_{i+q+1}, …, C_{i+ℓ−1}. In
both cases, every shortest path goes through at least q = 16ε(k) + 1 maximal cliques, and
then by Proposition 3, the length of such a path is at least 9. By similar arguments, for every
v ∈ C_{i+ℓ−1−q} and u ∈ C_{i+ℓ−1}, the distance between v and u in G is also at least 9.
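
The size bounds used in this section are explicit, so their arithmetic can be checked directly. The Python sketch below evaluates δ(k), ε(k) and ζ(k) as defined in the text. The function names delta, epsilon, zeta and unmarked_run are assumptions made for this example. It confirms that the pigeonhole quotient is exactly 2[ε(k)]^2 + 32ε(k) + 3, and that removing q = 16ε(k) + 1 cliques from each end of such a run leaves 2[ε(k)]^2 + 1 cliques.

```python
from math import factorial

def delta(k):
    """Bound on |T|: 8 * 8! * (k + 1)^8 + k."""
    return 8 * factorial(8) * (k + 1) ** 8 + k

def epsilon(k):
    """Bound on the maximum clique size: (k + 1)(delta(k) + 2)."""
    return (k + 1) * (delta(k) + 2)

def zeta(k):
    """Number of maximal cliques required by Lemma 5."""
    e = epsilon(k)
    return (delta(k) * (8 * e + 2) + 1) * (2 * e * e + 32 * e + 3)

def unmarked_run(k):
    """Length of the run of consecutive unmarked cliques (pigeonhole)."""
    return zeta(k) // (delta(k) * (8 * epsilon(k) + 2) + 1)

k = 1
e = epsilon(k)
q = 16 * e + 1
run = unmarked_run(k)
middle = run - 2 * q  # cliques from C_{i+q} to C_{i+l-1-q}, inclusive
```

Already for k = 1 the bound δ(k) exceeds 8 · 10^7, which illustrates that the kernel size obtained this way is polynomial but far from practical.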

Clique C_{i+q} is maximal, and thus it contains a vertex x which does not belong to C_{i+q+1}.
Similarly, let y ∈ C_{i+ℓ−1−q} \ C_{i+ℓ−1−q−1}. We compute the minimum size s of an x,y-separator
in I. Let us note that in proper interval graphs such a separator is an intersection of two
consecutive cliques in the clique path, and can be found in polynomial time. We construct a
new graph G' from G as follows. In this new graph G', the connected component I of the proper
interval graph G_T is replaced by a smaller proper interval graph I'. The proper interval graph
I' is formed by the cliques C_1, C_2, …, C_{i+q} and C_{i+ℓ−1−q}, C_{i+ℓ−q}, …, C_p, and a new clique C of
size s. We make all vertices of C adjacent to all vertices of C_{i+q} and of C_{i+ℓ−1−q}. Let us note
that because there are at least 2[ε(k)]^2 + 1 maximal cliques in the clique path between C_{i+q} and
C_{i+ℓ−1−q}, there are at least ε(k) + 1 vertices belonging only to these cliques. Thus I' has fewer
vertices than I.

It is easy to check that I' is a proper interval graph. The construction of the graph I',
and hence the graph G', can be done in polynomial time. Indeed, marking maximal cliques
can be performed in polynomial time, and computing the size of a minimum separator
in I can also be performed in polynomial time.

What remains is to argue that (G, k) is a YES instance if and only if (G', k) is a YES
instance. Let R = V(I) \ (V(I') ∪ C) be the vertices of G removed during the construction of the
graph G'. Vertices of R cannot be contained in any of the small forbidden induced subgraphs:
claw, net, tent, and cycles C_ℓ, ℓ ≤ 8. The reason is that the distance from any vertex
v of R to any vertex of T is at least 9, and thus if such a small induced subgraph contains v,
it should be a subgraph of the proper interval graph I, which is a contradiction. Thus every
set which hits small forbidden induced subgraphs in G also hits them in G' and vice versa.
So in further arguments we concentrate only on induced cycles (holes) of length at least 9.

Let X be a set of size at most k such that G \ X is a proper interval graph. We assume
that the set X is a minimal set with these properties. If X intersects R, this is because X
hits some cycles passing in G through R. We claim that in this case, X should contain at
least s vertices from R. Let v ∈ X ∩ R. By minimality of X, there should be a witness hole
Q of length at least 9 such that Q ∩ X = {v}. Let u and w be the neighbours of v in
Q. Then the set of vertices N_G(u) ∩ N_G(w) is a subset of X. Indeed, if there was a vertex
v' ∈ (N_G(u) ∩ N_G(w)) \ X, then the cycle obtained from Q by replacing v with v' is a hole
of length at least 9 avoiding X. On the other hand, by making use of Proposition 3, it is easy
to show that the set N_G(u) ∩ N_G(w) separates x and y, and thus |N_G(u) ∩ N_G(w)| ≥ s.
Therefore either |X ∩ R| ≥ s, or X ∩ R = ∅.

If |X ∩ R| ≥ s, then X' = (X \ R) ∪ C is of size at most |X|. We claim that G' \ X' is
a proper interval graph. Indeed, every induced subgraph of G' which is forbidden for proper
interval graphs either intersects C, or should be also a subgraph of G \ R. But in both cases,
this subgraph is hit by X'.

Suppose that R ∩ X = ∅, and let X' = X. We know that G' \ X' cannot contain a claw, net, tent, or
induced cycle C_ℓ, ℓ ≤ 8, because each of these subgraphs is entirely in G \ R, and thus is hit
by X'. If G' \ X' contains an induced cycle Q of length at least 9, then this cycle cannot be
entirely in G \ R and thus should touch C. Because Q is an induced cycle, it should contain
a path passing through a vertex a ∈ C_{i+q}, then continuing through a vertex from C, and
through a vertex b ∈ C_{i+ℓ-1-q}. In G, vertices a and b are also connected by a path that
uses only vertices of R and thus avoids X. Replacing in Q the a,b-path passing
through C by an a,b-path passing through R, we obtain an induced cycle of length at least
9 in G avoiding X. But this is a contradiction, and we conclude that G' \ X' is a proper
interval graph. We have shown that if (G, k) is a YES instance then (G', k) is a YES instance.

Let X' be a minimal proper interval vertex deletion set of graph G'. If X' intersects C, let
v ∈ X' ∩ C. Because X' is minimal, we can select a witness induced cycle Q of length at least
9 such that v is the only vertex of X' in Q. Let u ∈ C_{i+q}, w ∈ C_{i+ℓ-1-q} be the neighbours of
v in Q. Then every vertex from the set N(u) ∩ N(w) should be in X' too, because otherwise
it would be possible to modify Q into an induced cycle Q' of length at least 9 avoiding X'. By
the way we constructed graph G', we have that C = N(u) ∩ N(w), and thus C ⊆ X'. Let C*
be a minimum x,y-separator in G. The size of C* is s, and the set X = (X' \ C) ∪ C* is of
size |X'|. Every forbidden (for proper interval graphs) subgraph in G avoiding X' \ C should
contain a path connecting a vertex from C_{i+q} to a vertex from C_{i+ℓ-1-q}, and thus is hit by
C*. Thus G \ X is a proper interval graph. If X' does not intersect clique C, then G \ X' is
a proper interval graph. Indeed, every induced cycle of length at least 9 in G containing a
path P connecting a vertex from C_{i+q} to a vertex from C_{i+ℓ-1-q} can be transformed to a
cycle of length at least 9 in G' by replacing P with a path of length 2 passing through C.
This implies that X' hits every forbidden subgraph in G too. We have shown that if (G', k)
is a YES instance then (G, k) is a YES instance, which concludes the proof of the lemma. □

4.4 Putting all together: final kernel analysis 

We need some auxiliary reduction rules to give the kernel for PIVD. Let ℱ be the family
consisting of all nets, tents, claws and induced cycles C_ℓ for ℓ ∈ {4, 5, ..., 8} of the input
graph G.

Lemma 6. Let (G, k) be an instance of PIVD and T be as defined before. Let X be a subset
of T such that for every x ∈ X we have a set S_x ∈ ℱ such that S_x \ {x} ⊆ (V(G) \ T).
If |X| > k then we conclude that G cannot be transformed into a proper interval graph by
deleting at most k vertices. Else, (G, k) is a YES instance if and only if (G[V \ X], k - |X|)
is a YES instance.

Proof. We first argue that X is a subset of every minimal hitting set S' of size at most k for
the family ℱ. By the property of the set T we have that S' ⊆ T. This implies that for any
forbidden set in ℱ all but one of whose vertices lie in V(G) \ T, the remaining vertex must be
contained in every minimal hitting set of size at most k. This shows that X ⊆ S'.

Suppose (G, k) is a YES instance. Then there exists a set P of size at most k such that
G \ P is a proper interval graph. Hence P is also a hitting set for the family ℱ. Let P' be a subset
of P such that P' is a minimal hitting set for ℱ. By the property of the set T we have that
P' ⊆ T. By the arguments in the first paragraph we have that X ⊆ P'. In fact, X is a subset of
every set of size at most k whose deletion makes the graph proper interval. Hence
(G, k) is a YES instance if and only if (G \ X, k - |X|) is a YES instance. This also implies
that if |X| > k then (G, k) is a NO instance and hence in this case we conclude that G
cannot be made into a proper interval graph by deleting at most k vertices. This completes the
proof. □
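
As an illustration of the rule in Lemma 6, the following Python sketch collects the vertices of T that are the only vertex of T in some small forbidden subgraph. It assumes that the family of small forbidden subgraphs has already been enumerated as a list of vertex sets (`family`); the function names are hypothetical, and the sketch simply takes the largest set X permitted by the lemma.

```python
def forced_vertices(family, T):
    """Vertices x in T such that some forbidden set S has S ∩ T == {x}."""
    T = set(T)
    X = set()
    for S in family:
        inside = set(S) & T
        if len(inside) == 1:  # all other vertices of S lie in V(G) \ T
            X |= inside
    return X


def apply_lemma6(family, T, k):
    """Return None for a NO instance, else (X, reduced budget k - |X|)."""
    X = forced_vertices(family, T)
    if len(X) > k:
        return None
    return X, k - len(X)
```

A caller would then delete X from the graph and continue with the reduced budget; an empty X means the rule does not apply.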

Now we are ready to state the main result of this paper. 
Theorem 1. PIVD admits a polynomial kernel. 

Before proceeding with the proof of the theorem, let us recall the definitions of all functions
used so far.



Proof. Let (G, k) be an instance of PIVD. We first show that if G is not connected then we
can reduce it to the connected case. If there is a connected component C of G such that C is a
proper interval graph then we delete this component. Clearly, (G, k) is a YES instance if and
only if (G \ C, k) is a YES instance. We repeat this process until no connected component
of G is a proper interval graph. At this stage if the number of connected components is at
least k + 1, then we conclude that G cannot be made into a proper interval graph by deleting
at most k vertices. Thus, we assume that G has at most k connected components. Now we
show how to obtain a kernel for the case when G is connected, and for the disconnected case
we just run this algorithm on each connected component. This only increases the kernel size
by a factor of k. From now onwards we assume that G is connected.
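
The preprocessing of connected components described above can be sketched as follows. This is a minimal illustration only: the graph is assumed to be an adjacency dictionary `adj`, and the recognition test `is_proper_interval` is a hypothetical caller-supplied predicate (proper interval graphs can be recognized in polynomial time, but no particular recognition algorithm is implemented here).

```python
from collections import deque


def connected_components(adj):
    """Connected components of a graph given as {vertex: iterable of neighbours}."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        component, queue = {start}, deque([start])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if u not in seen:
                    seen.add(u)
                    component.add(u)
                    queue.append(u)
        components.append(component)
    return components


def drop_easy_components(adj, k, is_proper_interval):
    """Discard proper interval components; None means a NO instance."""
    remaining = [c for c in connected_components(adj)
                 if not is_proper_interval(adj, c)]
    if len(remaining) > k:  # each remaining component needs at least one deletion
        return None
    return remaining
```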

Now we apply Lemma 3 on G and in polynomial time either find a non-empty set T ⊆
V(G) such that

1. G \ T is a proper interval graph;

2. Y ⊆ V(G) of size at most k is a minimal hitting set for nets, tents, claws and induced
cycles C_ℓ for ℓ ∈ {4, 5, ..., 8} contained in G if and only if it is a minimal hitting set for
nets, tents, claws and induced cycles C_ℓ for ℓ ∈ {4, 5, ..., 8} contained in G[T]; and

3. |T| ≤ δ(k),

or conclude that G cannot be made into a proper interval graph by deleting at most k
vertices. If Lemma 3 concludes that G cannot be transformed into a proper interval graph
by deleting at most k vertices, then the kernelization algorithm returns the same.

If the size of a maximum clique in G_T is more than ε(k), then we apply Lemma 4 and
obtain a vertex v ∈ V(G_T) such that (G, k) is a YES instance if and only if (G \ v, k) is a
YES instance. We apply Lemma 4 repeatedly until the size of a maximum clique in G_T is at
most ε(k). So, from now onwards we assume that the size of a maximum clique in G_T is at
most ε(k).

Now we apply Lemma 6 on (G, k). If Lemma 6 concludes that G cannot be made into
a proper interval graph by deleting at most k vertices, then (G, k) is a NO instance and the
kernelization algorithm returns a trivial NO instance. Otherwise, we find a set X ⊆ T such
that (G, k) is a YES instance if and only if (G \ X, k - |X|) is a YES instance. If |X| ≥ 1 then
(G \ X, k - |X|) is a smaller instance and we start all over again with this as the new instance
of PIVD.

If we cannot apply Lemma 6 anymore, then every claw in G contains at least two vertices
from T. Thus if the number of maximal cliques in a connected component of G_T is more
than ζ(k), we can apply Lemma 5 on (G, k) and obtain an equivalent instance (G', k) such
that |V(G')| < |V(G)| and then we start all over again with instance (G', k).

Finally, we are in the case where G_T is a proper interval graph and none of
Lemmata 4, 5 and 6 can be applied. This implies that the number of maximal cliques
in each connected component of G_T is at most ζ(k) and the size of each maximal clique is
at most ε(k). Thus we have that every connected component of G_T has at most ζ(k)ε(k)
vertices. Since G is connected, we have that every connected component of G_T has some
neighbour in T. However, because Lemma 6 cannot be applied, we have that every vertex in
T has neighbours in at most 2 connected components. The last assertion holds for
the following reason. If a vertex v in T has neighbours in at least 3 connected components
of G_T then v together with a neighbour from each of the components of G_T forms a claw
in G, with all the vertices except v in G_T, which would imply that Lemma 6 is applicable.
This implies that the total number of connected components in G_T is at most 2δ(k). Thus
the total number of vertices in G is at most 2δ(k)ζ(k)ε(k).
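
The claw argument used above can be made concrete with a short Python sketch. It is an illustration under assumptions: the graph is an adjacency dictionary `adj`, `T` is the vertex set from Lemma 3, and the function name is hypothetical. The function looks for a vertex of T with neighbours in at least three connected components of G \ T and returns it together with one neighbour from each of three such components; these three neighbours are pairwise non-adjacent, so the four vertices induce a claw.

```python
from collections import deque


def claw_centre_in_T(adj, T):
    """Return (v, three leaves) for a claw centred at some v in T, or None."""
    T = set(T)
    component_of = {}
    next_id = 0
    # Label the connected components of the graph with T removed.
    for start in adj:
        if start in T or start in component_of:
            continue
        component_of[start] = next_id
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if u not in T and u not in component_of:
                    component_of[u] = next_id
                    queue.append(u)
        next_id += 1
    # Look for a vertex of T that touches at least three components.
    for v in T:
        leaf_in = {}
        for u in adj[v]:
            if u not in T:
                leaf_in.setdefault(component_of[u], u)
        if len(leaf_in) >= 3:
            return v, list(leaf_in.values())[:3]
    return None
```

When the function returns None, every vertex of T touches at most two components of G \ T, which is the situation used in the counting argument.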

Recall that G may not be connected. However, we argued that G can have at most k connected
components and we apply the kernelization procedure on each connected component.
If the kernelization procedure returns that some particular component cannot be made into
a proper interval graph by deleting at most k vertices, then we return the same for G. Else,
the total number of vertices in the reduced instance is at most 2k · δ(k)ζ(k)ε(k), which is a
polynomial in k.

Observe that the above procedure runs in polynomial time, as with every step of the
algorithm, the number of vertices in the input graph decreases. Together with the fact
that Lemmata 4, 5 and 6 run in polynomial time, this implies that the whole kernelization
algorithm runs in polynomial time. This concludes the proof. □
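
The overall control flow, in which reduction rules are applied until none of them is applicable, can be sketched as a generic driver loop. This is only a schematic illustration: the rules are assumed to be passed in as Python callables that return `None` when not applicable, the sentinel `NO_INSTANCE` when they detect a NO instance, and a strictly smaller instance otherwise; all names are hypothetical.

```python
NO_INSTANCE = object()  # hypothetical sentinel for "trivial NO instance"


def kernelize(instance, rules, size):
    """Apply reduction rules exhaustively, restarting after every success."""
    while True:
        for rule in rules:
            outcome = rule(instance)
            if outcome is NO_INSTANCE:
                return NO_INSTANCE
            if outcome is not None:
                if size(outcome) >= size(instance):
                    raise ValueError("a reduction rule must shrink the instance")
                instance = outcome
                break  # start all over again with the smaller instance
        else:
            return instance  # no rule applies: this is the kernel
```

Because every successful rule application strictly decreases `size`, the outer loop runs at most `size(instance)` times, which mirrors the termination argument above.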

5 Conclusion and discussions 

In this paper we proved that PIVD admits a polynomial kernel. While resolving the complexity
of the problem from the kernelization perspective, we have to admit that in its current
form our result is purely of theoretical importance. This is due to the large number
2k · δ(k)ζ(k)ε(k) ∈ O(k^53) of vertices in our kernel. It is possible to improve the size of our
kernel slightly at the cost of tedious case analysis, but the challenging open question is whether a
kernel of "reasonably" polynomial size, say k^10, is possible. It seems that for this type of
kernel we need completely different techniques. On the other hand, is it possible to prove
that PIVD does not have a kernel of size k^7?

Another interesting open question is whether p-Chordal Graph Vertex Deletion admits a
polynomial kernel. The problem is known to be FPT by the result of Marx [19]. And finally,
what about p-Interval Graph Vertex Deletion? We do not even know if the problem
is FPT.